We design a novel algorithm for solving Mean-Payoff Games (MPGs). Besides solving an MPG 
in the usual sense, our algorithm computes more information about the game, information that is 
important with respect to applications. The weights of the edges of an MPG can be thought of as 
energy gained or consumed, depending on the sign. For each vertex, our algorithm computes the 
minimum amount of initial energy that is sufficient for player Max to ensure that in a play starting 
from the vertex, the energy level never goes below zero. Our algorithm is not the first algorithm 
that computes the minimum sufficient initial energies, but according to our experimental study it is 
the fastest algorithm that computes them. The reason is that it utilizes the strategy improvement 
technique which is very efficient in practice. 

1 Introduction 

A Mean-Payoff Game (MPG) [12, 15, 19] is a two-player infinite game played on a finite weighted 
directed graph, the vertices of which are divided between the two players. A play starts by placing a 
token on some vertex and the players, named Max and Min, move the token along the edges of the graph 
ad infinitum. If the token is on Max's vertex, he chooses an outgoing edge and the token goes to the 
destination vertex of that edge. If the token is on Min's vertex, it is her turn to choose an outgoing edge. 
Roughly speaking, Max wants to maximize the average weight of the traversed edges whereas Min wants 
to minimize it. It was proved in [12] that each vertex $v$ has a value, denoted by $\nu(v)$, which each player 
can secure by a positional strategy, i.e., a strategy that always chooses the same outgoing edge in the same 
vertex. To solve an MPG is to find the values of all vertices, and, optionally, also strategies that secure 
the values. 

In this paper we deal with MPGs with a goal other than the standard average-weight goal. Player Max now 
wants the sum of the weights of the traversed edges, plus some initial value (initial "energy"), to be 
non-negative at each moment of the play. He also wants to know the minimal sufficient amount of initial 
energy that enables him to stay non-negative. For different starting vertices, the minimal sufficient initial 
energy may be different, and for starting vertices with $\nu < 0$ it is impossible to stay non-negative even 
with an arbitrarily large amount of initial energy. 

The problem of computing the minimal sufficient initial energies has been studied under different 
names by Chakrabarti et al. [5], Lifshits and Pavlov [17], and Bouyer et al. [2]. In [5] it was called the 
problem of pure energy interfaces, in [17] it was called the problem of potential computation, and in [2] 
it was called the lower-bound problem. The paper [2] also contains the definition of a similar problem, 
the lower-weak-upper-bound problem. An instance of this problem contains, besides an MPG, also a 
bound $b$. The goal is the same: Max wants to know how much initial energy he needs to stay non-negative 
forever, but now the energy level is bounded from above by $b$ and during the play, all increases above 
this bound are immediately truncated. 
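
The difference between the two problems can be made concrete on a single finite sequence of edge weights. The following is a minimal sketch, not part of the algorithms discussed in this paper; the function name `survives` and its parameters are illustrative assumptions. It tracks the energy level along the sequence, optionally truncating it at the bound.

```python
from typing import Iterable, Optional


def survives(weights: Iterable[int], initial: int, bound: Optional[int] = None) -> bool:
    """Return True if the energy level never drops below zero.

    bound=None models the lower-bound problem (no cap on the energy level).
    An integer bound models the lower-weak-upper-bound problem: energy above
    the cap is silently lost, which is never an error in itself.
    """
    # Hypothetical convention: initial energy above the cap is truncated too.
    energy = initial if bound is None else min(initial, bound)
    for w in weights:
        energy += w
        if bound is not None:
            energy = min(energy, bound)  # weak upper bound: truncate the surplus
        if energy < 0:
            return False
    return True
```

With the weights -3, +10, -12 and initial energy 5, the play survives without a bound, but with bound 5 the recharge of 10 is mostly wasted and the final step drives the level below zero.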
Various resource scheduling problems for which the standard solution of an MPG is not useful can be 
formulated as the lower-bound or the lower-weak-upper-bound problems, which extends the applicability 
of MPGs. For example, an MPG can be used to model a robot in a hostile environment. The weights 
of edges represent changes in the remaining battery capacity of the robot: positive edges represent 
recharging, negative edges represent energy-consuming actions. The bound $b$ is the maximum capacity 
of the battery. Player Max chooses the actions of the robot and player Min chooses the actions of the 
hostile environment. By solving the lower-weak-upper-bound problem, we find out if there is some 
strategy of the robot that allows it to survive in the hostile environment, i.e., its remaining battery 
capacity never goes below zero, and if there is such a strategy, we also get the minimum initial remaining 
battery capacity that allows it to survive. 

The first algorithm solving the lower-bound problem was proposed by Chakrabarti et al. [5] and it 
is based on value iteration. The algorithm can also be easily modified to solve the lower-weak-upper-bound 
problem. The value iteration algorithm was later improved by Chaloupka and Brim in [7], and 
independently by Doyen, Gentilini, and Raskin [11]; an extended version of [7, 11] was recently submitted 
for publication. Henceforward we will use the term "value iteration" (VI) to denote only this improved 
version. The algorithms of Bouyer et al. [2] that solve the two problems are essentially the same as 
the original algorithm from [5]. However, [2] focuses mainly on problems other than the lower-bound 
and the lower-weak-upper-bound problems for MPGs. A different approach to solving the lower-bound 
problem was proposed by Lifshits and Pavlov [17], but their algorithm has exponential space complexity, 
and so it is not appropriate for practical use. VI seems to be the best known approach to solving the 
two problems. 

In this paper, we design a novel algorithm based on the strategy improvement technique, suitable for 
practical solving of the lower-bound and the lower-weak-upper-bound problems for large MPGs. The 
use of the strategy improvement technique for solving MPGs goes back to the algorithm of Hoffman and 
Karp from 1966 [16]. Their algorithm can be used to solve only a restricted class of MPGs, but strategy 
improvement algorithms for solving MPGs in general exist as well [1, 18, 10]. However, all of them 
solve neither the lower-bound nor the lower-weak-upper-bound problem (cf. Section HI first part, last 
paragraph); our algorithm is the first to do so. Another contribution of this paper is a further improvement of VI. 

The shortcoming of VI is that it takes enormous time on MPGs with at least one vertex with $\nu < 0$. 
A natural way to alleviate this problem is to find the vertices with $\nu < 0$ by some fast algorithm and run 
VI on the rest. Based on our previous experience with algorithms for solving MPGs [6], we selected 
two algorithms for computation of the set of vertices with $\nu < 0$, namely the algorithm of Bjorklund 
and Vorobyov [1] (BV) and the algorithm of Schewe [18] (SW). This gives us two algorithms: VI + BV 
and VI + SW. However, the preprocessing is not helpful on MPGs where all vertices have $\nu \geq 0$, and it is 
also not helpful for solving the lower-weak-upper-bound problem for a small bound $b$. Therefore, we also 
study the algorithm VI without the preprocessing. 

Our new algorithm based on the strategy improvement technique that we propose in this paper has 
the complexity $O(|V| \cdot (|V| \cdot \log |V| + |E|) \cdot W)$, where $W$ is the maximal absolute edge-weight. It is 
slightly worse than the complexity of VI, the same as the complexity of VI + BV, and better than the 
complexity of VI + SW. We call our algorithm "Keep Alive Strategy Improvement" (KASI). It solves 
both the lower-bound and the lower-weak-upper-bound problem. Moreover, as each algorithm that solves 
the lower-bound problem also divides the vertices of an MPG into those with $\nu \geq 0$ and those with $\nu < 0$, 
which can be used to compute the exact $\nu$ values of all vertices, KASI can be thought of as an algorithm 
that also solves MPGs in the usual sense. As a by-product of the design of KASI, we improved the 
complexity of BV and proved that Min may not have a positional strategy that is also optimal with respect 
to the lower-weak-upper-bound problem. Moreover, we describe a way to construct an optimal strategy 
for Min with respect to the lower-weak-upper-bound problem. 

To evaluate and compare the algorithms VI, VI + BV, VI + SW, and KASI, we implemented them 
and carried out an experimental study. According to the study, KASI is the best algorithm. 

2 Preliminaries 

A Mean-Payoff Game (MPG) [12, 15, 19] is given by a triple $\Gamma = (G, V_{Max}, V_{Min})$, where $G = (V, E, w)$ 
is a finite weighted directed graph such that $V$ is a disjoint union of the sets $V_{Max}$ and $V_{Min}$, $w : E \to \mathbb{Z}$ 
is the weight function, and each $v \in V$ has out-degree at least one. The game is played by two opposing 
players, named Max and Min. A play starts by placing a token on some given vertex and the players 
then move the token along the edges of $G$ ad infinitum. If the token is on vertex $v \in V_{Max}$, Max moves 
it. If the token is on vertex $v \in V_{Min}$, Min moves it. This way an infinite path $p = (v_0, v_1, v_2, \ldots)$ is 
formed. Max's aim is to maximize his gain: $\liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1})$, and Min's aim is to minimize 
her loss: $\limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1})$. For each vertex $v \in V$, we define its value, denoted by $\nu(v)$, as 
the maximal gain that Max can ensure if the play starts at vertex $v$. It was proved that it is equal to the 
minimal loss that Min can ensure. Moreover, both players can ensure $\nu(v)$ by using positional strategies 
defined below [12]. 
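
When both players use positional strategies, the resulting play is eventually periodic: a finite prefix followed by a cycle repeated forever. For such a play the liminf and limsup above coincide and equal the mean weight of the cycle. The sketch below illustrates this; the function names and the list-based representation of a play are illustrative assumptions.

```python
from fractions import Fraction
from typing import Sequence


def lasso_mean_payoff(prefix: Sequence[int], cycle: Sequence[int]) -> Fraction:
    """Limit average weight of the play prefix . cycle^omega.

    The finite prefix has no influence on the limit, so only the cycle counts.
    """
    if not cycle:
        raise ValueError("the repeated cycle must contain at least one edge")
    return Fraction(sum(cycle), len(cycle))


def average_after(prefix: Sequence[int], cycle: Sequence[int], n: int) -> Fraction:
    """Average weight of the first n edges (n >= 1) of prefix . cycle^omega."""
    total = 0
    for i in range(n):
        if i < len(prefix):
            total += prefix[i]
        else:
            total += cycle[(i - len(prefix)) % len(cycle)]
    return Fraction(total, n)
```

Comparing `average_after` for growing `n` with `lasso_mean_payoff` shows the finite averages approaching the cycle mean, even when the prefix contains a very heavy edge.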

A (general) strategy of Max is a function $\sigma : V^* \cdot V_{Max} \to V$ such that for each finite path $p = 
(v_0, \ldots, v_k)$ with $v_k \in V_{Max}$, it holds that $(v_k, \sigma(p)) \in E$. Recall that each vertex has out-degree at least 
one, and so the definition of a strategy is correct. The set of all strategies of Max in $\Gamma$ is denoted by 
$\Sigma^\Gamma$. We say that an infinite path $p = (v_0, v_1, v_2, \ldots)$ agrees with the strategy $\sigma \in \Sigma^\Gamma$ if for each $v_i \in V_{Max}$, 
$\sigma(v_0, \ldots, v_i) = v_{i+1}$. A strategy $\pi$ of Min is defined analogously. The set of all strategies of Min in $\Gamma$ is 
denoted by $\Pi^\Gamma$. Given an initial vertex $v \in V$, the outcome of two strategies $\sigma \in \Sigma^\Gamma$ and $\pi \in \Pi^\Gamma$ is the 
(unique) infinite path $\mathrm{outcome}^\Gamma(v, \sigma, \pi) = (v = v_0, v_1, v_2, \ldots)$ that agrees with both $\sigma$ and $\pi$. 

The strategy $\sigma \in \Sigma^\Gamma$ is called a positional strategy if $\sigma(p) = \sigma(p')$ for all finite paths $p = (v_0, \ldots, v_k)$ 
and $p' = (v'_0, \ldots, v'_{k'})$ such that $v_k = v'_{k'} \in V_{Max}$. For the sake of simplicity, we think of a positional strategy 
of Max as a function $\sigma : V_{Max} \to V$ such that $(v, \sigma(v)) \in E$ for each $v \in V_{Max}$. The set of all positional 
strategies of Max in $\Gamma$ is denoted by $\Sigma^\Gamma_M$. A positional strategy $\pi$ of Min is defined analogously. The set 
of all positional strategies of Min in $\Gamma$ is denoted by $\Pi^\Gamma_M$. We define $G_\sigma$, the restriction of $G$ to $\sigma$, as the 
graph $(V, E_\sigma, w_\sigma)$, where $E_\sigma = \{(u,v) \in E \mid u \in V_{Min} \lor \sigma(u) = v\}$, and $w_\sigma = w \restriction E_\sigma$. That is, we get $G_\sigma$ 
from $G$ by deleting all the edges emanating from Max's vertices that do not follow $\sigma$. $G_\pi$ for a strategy 
$\pi$ of Min is defined analogously. For $\sigma \in \Sigma^\Gamma_M$, we also define $\Gamma_\sigma = (G_\sigma, V_{Max}, V_{Min})$, and for $\pi \in \Pi^\Gamma_M$, 
$\Gamma_\pi = (G_\pi, V_{Max}, V_{Min})$. 
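
The restriction of a graph to a positional strategy is a simple edge filter. The following sketch assumes a hypothetical representation in which the weight function is a dictionary from edges to integers and a positional strategy is a dictionary from the owner's vertices to chosen successors.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def restrict_to_strategy(
    weights: Dict[Edge, int],
    owner_vertices: Set[str],
    strategy: Dict[str, str],
) -> Dict[Edge, int]:
    """Keep every edge of the opponent and only the chosen edge of the owner.

    `owner_vertices` are the vertices of the player whose positional
    `strategy` is being fixed; the result is the restricted weight function.
    """
    kept = {}
    for (u, v), w in weights.items():
        if u not in owner_vertices or strategy[u] == v:
            kept[(u, v)] = w
    return kept
```

Passing Max's vertices and a strategy of Max yields the edge set of the graph restricted to that strategy; passing Min's vertices and a strategy of Min yields the analogous restriction for Min.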

The lower-bound problem for an MPG $\Gamma = (G = (V,E,w), V_{Max}, V_{Min})$ is the problem of finding 
$\mathrm{lb}^\Gamma(v) \in \mathbb{N}_0 \cup \{\infty\}$ for each $v \in V$, such that: 

$$\mathrm{lb}^\Gamma(v) = \min\{x \in \mathbb{N}_0 \mid (\exists \sigma \in \Sigma^\Gamma)(\forall \pi \in \Pi^\Gamma)(\ \mathrm{outcome}^\Gamma(v, \sigma, \pi) = (v = v_0, v_1, v_2, \ldots) \ \land\ (\forall n \in \mathbb{N})(x + \sum_{j=0}^{n-1} w(v_j, v_{j+1}) \geq 0)\ )\}$$

where the minimum of an empty set is $\infty$. That is, $\mathrm{lb}^\Gamma(v)$ is the minimal sufficient amount of initial energy 
that enables Max to keep the energy level non-negative forever, if the play starts from $v$. If $\mathrm{lb}^\Gamma(v) = \infty$, 
which means that $\nu(v) < 0$, then we say that Max loses from $v$, because even an arbitrarily large amount of initial 
energy is not sufficient. If $\mathrm{lb}^\Gamma(v) \in \mathbb{N}_0$, then Max wins from $v$. 

The strategy $\sigma \in \Sigma^\Gamma$ is an optimal strategy of Max with respect to the lower-bound problem, if it 
ensures that for each $v \in V$ such that $\mathrm{lb}^\Gamma(v) \neq \infty$, $\mathrm{lb}^\Gamma(v)$ is a sufficient amount of initial energy. 
The strategy $\pi \in \Pi^\Gamma$ is an optimal strategy of Min with respect to the lower-bound problem, if it 
ensures that for each $v \in V$ such that $\mathrm{lb}^\Gamma(v) \neq \infty$, Max needs at least $\mathrm{lb}^\Gamma(v)$ units of initial energy, and for 
each $v \in V$ such that $\mathrm{lb}^\Gamma(v) = \infty$, Max loses. 

The lower-weak-upper-bound problem for an MPG $\Gamma = (G = (V,E,w), V_{Max}, V_{Min})$ and a bound 
$b \in \mathbb{N}_0$ is the problem of finding $\mathrm{lwub}^\Gamma_b(v) \in \mathbb{N}_0 \cup \{\infty\}$ for each $v \in V$, such that: 

$$\mathrm{lwub}^\Gamma_b(v) = \min\{x \in \mathbb{N}_0 \mid (\exists \sigma \in \Sigma^\Gamma)(\forall \pi \in \Pi^\Gamma)(\ \mathrm{outcome}^\Gamma(v, \sigma, \pi) = (v = v_0, v_1, v_2, \ldots) \ \land\ (\forall n \in \mathbb{N})(x + \sum_{j=0}^{n-1} w(v_j, v_{j+1}) \geq 0) \ \land\ (\forall n_1, n_2 \in \mathbb{N}_0)(n_1 < n_2 \Rightarrow \sum_{j=n_1}^{n_2-1} w(v_j, v_{j+1}) \geq -b)\ )\}$$

where the minimum of an empty set is $\infty$. That is, $\mathrm{lwub}^\Gamma_b(v)$ is the minimal sufficient amount of initial 
energy that enables Max to keep the energy level non-negative forever, if the play starts from $v$, under the 
additional condition that the energy level is truncated to $b$ whenever it exceeds the bound. The additional 
condition is equivalent to the condition that the play does not contain a segment of weight less than $-b$. 
If $\mathrm{lwub}^\Gamma_b(v) = \infty$, then we say that Max loses from $v$, because even an arbitrarily large amount of initial energy is 
not sufficient. If $\mathrm{lwub}^\Gamma_b(v) \in \mathbb{N}_0$, then Max wins from $v$. Optimal strategies for Max and Min with respect 
to the lower-weak-upper-bound problem are defined in the same way as for the lower-bound problem. 
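
For small games, the definition can be checked against a naive fixed-point iteration in the spirit of value iteration. The sketch below is neither the improved VI nor KASI; it is an unoptimized reference under the assumption that the least fixed point of the local update "energy needed before an edge = energy needed after it minus its weight, clamped to the range from 0 to b" coincides with the quantity defined above. The successor-list representation and all names are illustrative.

```python
import math
from typing import Dict, List, Set, Tuple

Succ = Dict[str, List[Tuple[str, int]]]  # vertex -> list of (successor, weight)


def naive_lwub(succ: Succ, max_vertices: Set[str], b: int) -> Dict[str, float]:
    """Minimal sufficient initial energy per vertex, or math.inf if Max loses."""

    def need_before(need_after: float, w: int) -> float:
        x = need_after - w          # energy required before taking the edge
        if x > b:                   # more than the cap can ever hold
            return math.inf
        return max(x, 0)

    need = {v: 0 for v in succ}     # start from the smallest candidate
    changed = True
    while changed:
        changed = False
        for v, edges in succ.items():
            options = [need_before(need[u], w) for u, w in edges]
            # Max picks the cheapest edge, Min the most expensive one.
            new = min(options) if v in max_vertices else max(options)
            if new > need[v]:
                need[v] = new
                changed = True
    return need
```

Each value can only grow and is either infinite or at most b, so the loop terminates, although possibly after many rounds; this is the kind of slow convergence on losing vertices that motivates faster algorithms.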

It was proved in [2] that for both the lower-bound problem and the lower-weak-upper-bound problem 
Max can restrict himself to positional strategies, i.e., he always has a positional strategy that is also 
optimal. Therefore, we could use the set $\Sigma^\Gamma_M$ instead of the set $\Sigma^\Gamma$ in the definitions of both the 
lower-bound problem and the lower-weak-upper-bound problem. 

In the rest of the paper, we will focus only on the lower-weak-upper-bound problem, because it 
includes the lower-bound problem as a special case. The reason is that for each $v \in V$ such that $\mathrm{lb}^\Gamma(v) < 
\infty$, it holds that $\mathrm{lb}^\Gamma(v) \leq (|V| - 1) \cdot W$, where $W$ is the maximal absolute edge-weight in $G$. A proof of this 
bound can be found in the literature. Therefore, if we choose $b = (|V| - 1) \cdot W$, then for each $v \in V$, $\mathrm{lb}^\Gamma(v) = \mathrm{lwub}^\Gamma_b(v)$. 

Let $G = (V,E,w)$ be a weighted directed graph, let $p = (v_0, \ldots, v_k)$ be a path in $G$, and let $c = 
(u_0, \ldots, u_{r-1}, u_r = u_0)$ be a cycle in $G$. Then $w(p)$, the weight of $p$, and $w(c)$, the weight of $c$, are defined 
in the following way: $w(p) = \sum_{i=0}^{k-1} w(v_i, v_{i+1})$, $w(c) = \sum_{i=0}^{r-1} w(u_i, u_{i+1})$. 

Let $\Gamma = (G = (V,E,w), V_{Max}, V_{Min})$ be an MPG and let $D \subseteq V$. Then $G(D)$ is the subgraph of $G$ 
induced by the set $D$. Formally, $G(D) = (D, E \cap D \times D, w \restriction D \times D)$. We also define the restriction of $\Gamma$ 
induced by $D$. Since some vertices might have zero out-degree in $G(D)$, we define $\Gamma(D) = (G'(D), V_{Max} \cap D, 
V_{Min} \cap D)$, where $G'(D)$ is the graph $G(D)$ with negative self-loops added to all vertices with zero 
out-degree. That is, we make the vertices with zero out-degree in $G(D)$ losing for Max in $\Gamma(D)$ with 
respect to the lower-weak-upper-bound problem. 
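
The restricted game can be built directly from this definition. In the sketch below the weight -1 of the added self-loops is an assumption made for illustration; the definition only requires the self-loops to be negative. The dictionary and set representation and the function name are likewise hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def restrict_game(
    weights: Dict[Edge, int],
    max_vertices: Set[str],
    min_vertices: Set[str],
    d_set: Set[str],
) -> Tuple[Dict[Edge, int], Set[str], Set[str]]:
    """Restriction of the game induced by d_set, with losing dead ends."""
    # Keep only edges with both endpoints inside d_set.
    sub = {(u, v): w for (u, v), w in weights.items() if u in d_set and v in d_set}
    has_successor = {u for (u, _v) in sub}
    for v in d_set - has_successor:
        sub[(v, v)] = -1  # assumed weight; any negative self-loop makes v losing for Max
    return sub, max_vertices & d_set, min_vertices & d_set
```

A vertex whose edges all leave the set can only loop on the negative self-loop in the restricted game, so no finite amount of initial energy keeps Max alive there.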

Let $G = (V,E,w)$ be a graph and let $B, A \subseteq V$. If we say that "$p$ is a path from $v$ to $B$" we mean a 
path with the last vertex and only the last vertex in $B$, formally: $p = (v = v_0, \ldots, v_k)$, where $v_0, \ldots, v_{k-1} \in 
V \setminus B \land v_k \in B$. Furthermore, a path from $A$ to $B$ is a path from $v$ to $B$ such that $v \in A$. The term "longest" 
in connection with paths always refers to the weights of the paths, not the numbers of edges. 

Operations on vectors of the same dimension are element-wise. For example, if $d_0$ and $d_1$ are two 
vectors of dimension $|V|$, then $d_0 < d_1$ means that for each $v \in V$, $d_0(v) \leq d_1(v)$, and for some $v \in V$, 
$d_0(v) < d_1(v)$. 

For the whole paper let $\Gamma = (G = (V,E,w), V_{Max}, V_{Min})$ be an MPG and let $W$ be the maximal absolute 
edge-weight in $G$, i.e., $W = \max_{e \in E} |w(e)|$. 
3 The Algorithm 

A high-level description of our Keep Alive Strategy Improvement algorithm (KASI) for the 
lower-weak-upper-bound problem is as follows. Let $(\Gamma, b)$ be an instance of the lower-weak-upper-bound problem. 
KASI maintains a vector $d \in (\mathbb{Z} \cup \{-\infty\})^V$ such that $-d \geq 0$ is always a lower estimate of $\mathrm{lwub}^\Gamma_b$, i.e., 
$-d \leq \mathrm{lwub}^\Gamma_b$. The vector $d$ is gradually decreased, and so $-d$ is increased, until $-d = \mathrm{lwub}^\Gamma_b$. The reason 
why KASI maintains the vector $d$ rather than $-d$ is that $d$ contains weights of certain paths and we find it 
more natural to keep them as they are than to keep their opposite values. The algorithm also maintains 
a set $D$ of vertices such that the vertices in $V \setminus D$ are already known to have an infinite $\mathrm{lwub}^\Gamma_b$ 
value. Initially, $d = 0$ and $D = V$. KASI starts with an arbitrary strategy $\pi \in \Pi^\Gamma_M$ and then iteratively 
improves it until no improvement is possible. In each iteration, the current strategy is first evaluated and 
then improved. The strategy evaluation examines the graph $G_\pi(D)$ and updates the vector $d$ so that for 
each $v \in D$, it holds 

$$-d(v) = \mathrm{lwub}_b^{\Gamma_\pi(D)}(v)$$

That is, it solves the lower-weak-upper-bound problem in the restricted game $\Gamma_\pi(D)$, where Min has 
no choices. This explains why the restricted game was defined the way it was: if a vertex from 
$D$ has outgoing edges only to $V \setminus D$, then it is losing for Max in $\Gamma$. The vertices with the $d$ value equal 
to $-\infty$ are removed from the set $D$. Since the strategy $\pi$ is either the first strategy or an improvement 
of the previous strategy, the vector $d$ is always decreased by the strategy evaluation and we get a better 
estimate of $\mathrm{lwub}^\Gamma_b$. To improve the current strategy, the algorithm checks whether for some $(v,u) \in E$ such 
that $v \in V_{Min}$ and $d(v) > -\infty$ it holds that $d(v) > d(u) + w(v,u)$. This is called the strategy improvement 
condition. Such an edge indicates that $-d(v)$ is not a sufficient initial energy at $v$, because traversing 
the edge $(v,u)$ and continuing from $u$ costs at least $-w(v,u) - d(u)$ units of energy, which is greater 
than $-d(v)$ (recall that $-d$ is a lower estimate of $\mathrm{lwub}^\Gamma_b$). If there are edges satisfying the condition, the 
strategy $\pi$ is improved in the following way. For each vertex $v \in V_{Min}$ such that there is an edge $(v,u) \in E$ 
such that $d(v) > d(u) + w(v,u)$, $\pi(v)$ is switched to $u$. If $v$ has more than one such edge emanating 
from it, any of them is acceptable. Then, another iteration of KASI is started. If no such edge exists, the 
algorithm terminates, because it holds that each vertex $v \in V$ has $-d(v) = \mathrm{lwub}^\Gamma_b(v)$. A detailed description 
of the algorithm follows. 
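
The strategy improvement step described above can be sketched in isolation. The representation (successor lists, dictionaries for the strategy and for the vector d, `-math.inf` for losing vertices) and the function name are illustrative assumptions. When several edges of a vertex satisfy the condition, this sketch keeps the last one it sees, which is permitted because any of them is acceptable.

```python
import math
from typing import Dict, List, Set, Tuple

Succ = Dict[str, List[Tuple[str, int]]]  # vertex -> list of (successor, weight)


def improve_strategy(
    succ: Succ,
    min_vertices: Set[str],
    strategy: Dict[str, str],
    d: Dict[str, float],
) -> Tuple[Dict[str, str], bool]:
    """Switch Min's choice wherever d(v) > d(u) + w(v, u) holds."""
    new_strategy = dict(strategy)
    improved = False
    for v in min_vertices:
        if d[v] == -math.inf:
            continue  # Max already loses from v, nothing to improve
        for u, w in succ[v]:
            if d[v] > d[u] + w:  # the strategy improvement condition
                new_strategy[v] = u
                improved = True
    return new_strategy, improved
```

In the usage below, Min's vertex m currently moves to y at cost 3, so d(m) = -3; moving to x instead forces Max to have at least 6 units of energy, so the strategy is switched.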

Figure 1 shows pseudo-code of the strategy evaluation part of our algorithm. The input to the 
procedure consists of four parts. The first and the second part form the lower-weak-upper-bound problem 
instance that the main algorithm KASI is solving: the MPG $\Gamma$ and the bound $b \in \mathbb{N}_0$. The third part is the 
strategy $\pi \in \Pi^\Gamma_M$ that we want to evaluate and the fourth part of the input is a vector $d_{-1} \in (\mathbb{Z} \cup \{-\infty\})^V$. 
The vector $d_{-1}$ is such that $-d_{-1}$ is a lower estimate of $\mathrm{lwub}^\Gamma_b$, computed for the previous strategy, or, in 
case of the first iteration of KASI, set by initialization to a vector of zeros. Let $A = \{v \in V \mid d_{-1}(v) = 0\}$ 
and $D = \{v \in V \mid d_{-1}(v) > -\infty\}$; then the following conditions hold. 

i. Each cycle in $G_\pi(D \setminus A)$ is negative. 

ii. For each $v \in D \setminus A$, it holds that $d_{-1}(v) < 0$ and for each edge $(v,u) \in E_\pi$, i.e., for each edge 
emanating from $v$ in $G_\pi$, it holds that $d_{-1}(v) \geq d_{-1}(u) + w(v,u)$. 

From these technical conditions it follows that $-d_{-1}$ is also a lower estimate of $\mathrm{lwub}_b^{\Gamma_\pi(D)}$ and the 
purpose of the strategy evaluation procedure is to decrease the vector $d_{-1}$ so that the resulting vector $d$ 
satisfies $-d = \mathrm{lwub}_b^{\Gamma_\pi(D)}$. To see why from (i.) and (ii.) it follows that $-d_{-1} \leq \mathrm{lwub}_b^{\Gamma_\pi(D)}$, consider a path 
$p = (v_0, \ldots, v_k)$ from $D \setminus A$ to $A$ in $G_\pi(D)$. From (ii.) it follows that for each $j \in \{0, \ldots, k-1\}$, it holds 
that $d_{-1}(v_j) \geq d_{-1}(v_{j+1}) + w(v_j, v_{j+1})$. If we sum the inequalities, we get $d_{-1}(v_0) \geq d_{-1}(v_k) + w(p)$. 
Since $v_k \in A$, $d_{-1}(v_k) = 0$ and the inequality becomes $d_{-1}(v_0) \geq w(p)$. Therefore, each infinite path in 
$G_\pi(D)$ starting from $v_0 \in D$ and containing a vertex from $A$ has a prefix of weight less than or equal to $d_{-1}(v_0)$. 
Furthermore, if the infinite path does not contain a vertex from $A$, weights of its prefixes cannot even be 
bounded from below, because by (i.), all cycles in $G_\pi(D \setminus A)$ are negative. All in all, $-d_{-1}$ is a lower 
estimate of $\mathrm{lwub}_b^{\Gamma_\pi(D)}$. 

The conditions (i.) and (ii.) trivially hold in the first iteration of the main algorithm, for $d_{-1} = 0$. In 
each subsequent iteration, $d_{-1}$ is taken from the output of the previous iteration and an intuition why the 
conditions hold will be given below. 

The output of the strategy evaluation procedure is a vector $d \in (\mathbb{Z} \cup \{-\infty\})^V$ such that for each $v \in D$, 
it holds that $-d(v) = \mathrm{lwub}_b^{\Gamma_\pi(D)}(v)$. Recall that $D = \{v \in V \mid d_{-1}(v) > -\infty\}$. 

The strategy evaluation works only with the restricted graph $G_\pi(D)$ and it is based on the fact that if 
we have the set $B_z = \{v \in D \mid \mathrm{lwub}_b^{\Gamma_\pi(D)}(v) = 0\}$, i.e., the set of vertices where Max does not need any 
initial energy to win, then we can compute $\mathrm{lwub}_b^{\Gamma_\pi(D)}$ of the remaining vertices by computing longest 
paths to the set $B_z$. More precisely, for each vertex $v \in D \setminus B_z$, $\mathrm{lwub}_b^{\Gamma_\pi(D)}(v)$ is equal to the absolute value 
of the weight of a longest path from $v$ to $B_z$ in $G_\pi(D)$ such that the weight of each suffix of the path is 
greater than or equal to $-b$. If each path from $v$ to $B_z$ has a suffix of weight less than $-b$ or no path from $v$ to 
$B_z$ exists, then $\mathrm{lwub}_b^{\Gamma_\pi(D)}(v) = \infty$. 

To get some idea about why this holds, consider a play winning for Max. The energy level never 
drops below zero in the play, and so there must be a moment from which onwards the energy level never 
drops below the energy level of that moment. Therefore, Max does not need any initial energy to win a 
play starting from the appropriate vertex (please note that Min has no choices in $\Gamma_\pi(D)$), and so $B_z$ is not 
empty. For the vertices in $D \setminus B_z$, in order to win, Max has to get to some vertex in $B_z$ without exhausting 
all of his energy. So the minimal sufficient energy to win is the minimal energy that Max needs to get to 
some vertex in $B_z$. All paths from $D \setminus B_z$ to $B_z$ must be negative (otherwise $B_z$ would be larger), and so 
the minimal energy to get to $B_z$ is the absolute value of the weight of a longest path to $B_z$ such that the 
weight of each suffix of the path is greater than or equal to $-b$. If no path to $B_z$ exists or all such paths have 
suffixes of weight less than $-b$, Max cannot win. 

Initially, the procedure over-approximates the set $B_z$ by the set $B_0$ of vertices $v$ with $d_{-1}(v) = 0$ 
that have an edge $(v,u)$ such that $w(v,u) - d_{-1}(v) + d_{-1}(u) \geq 0$ emanating from them (line 2), and then 
iteratively removes vertices from the set until it arrives at the correct set $B_z$. The vector $-d_i$ is always a 
lower estimate of $\mathrm{lwub}_b^{\Gamma_\pi(D)}$, i.e., it always holds that $-d_i \leq \mathrm{lwub}_b^{\Gamma_\pi(D)}$. Therefore, only vertices $v$ with 
$d_i(v) = 0$ are candidates for the final set $B_z$. However, the vertices $v$ with $d_i(v) = 0$ such that for each 
edge $(v,u)$, it holds that $w(v,u) - d_i(v) + d_i(u) < 0$ are removed from the set of candidates. The reason is 
that since $d_i(v) = 0$, the inequality can be developed to $-w(v,u) - d_i(u) > 0$, and so if the edge $(v,u)$ is 
chosen in the first step, then more than zero units of initial energy are needed at $v$. During the execution 
of the procedure, $d_i$ decreases, and so $-d_i$ increases, until $-d_i = \mathrm{lwub}_b^{\Gamma_\pi(D)}$. 

1 proc EvaluateStrategy(Γ, b, π, d_{-1}) 
2   i := 0; B_0 := {v ∈ V | d_{-1}(v) = 0 ∧ max_{(v,u) ∈ E_π} (w(v,u) − d_{-1}(v) + d_{-1}(u)) ≥ 0} 
3   while i = 0 ∨ B_{i−1} ≠ B_i do 
4     d_i := Dijkstra(G_π, b, B_i, d_{i−1}) 
5     i := i + 1 
6     B_i := B_{i−1} \ {v | max_{(v,u) ∈ E_π} (w(v,u) − d_{i−1}(v) + d_{i−1}(u)) < 0} 
7   od 
8   return d_{i−1} 
9 end 

Figure 1: Evaluation of strategy 

In each iteration, the procedure uses a variant of Dijkstra's algorithm to compute longest paths 
from all vertices to $B_i$ on line 4. Since $B_i$ is an over-approximation of $B_z$, the absolute values of the 
weights of the longest paths are a lower estimate of $\mathrm{lwub}_b^{\Gamma_\pi(D)}$. The weights of the longest paths are 
assigned to $d_i$. In particular, for each $v \in B_i$, $d_i(v) = 0$. Dijkstra's algorithm requires all edge-weights 
to be non-positive (please note that we are computing longest paths). Since edge-weights are arbitrary 
integers, we apply a potential transformation to them to make them non-positive. As vertex potentials we 
use $d_{i-1}$, which contains the longest path weights computed in the previous iteration, or, in case $i = 0$, is 
given as input. The transformed weight of an edge $(x,y)$ is $w(x,y) - d_{i-1}(x) + d_{i-1}(y)$, which is always 
non-positive for the relevant edges. In the first iteration of the main algorithm it follows from the condition 
(ii.), and in the subsequent iterations it follows from properties of longest path weights and the fact that 
only vertices with all outgoing edges negative with the potential transformation are removed from the 
candidate set. 
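
The effect of the potential transformation can be seen on a tiny graph. The sketch below is only an illustration of the transformed weight $w(x,y) - d(x) + d(y)$; the function name and the dictionary representation are assumptions, and the potentials in the usage example are taken to be the longest-path weights to a target vertex t, computed by hand.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def reduced_weights(weights: Dict[Edge, int], potential: Dict[str, int]) -> Dict[Edge, int]:
    """Transformed weight w(x, y) - potential(x) + potential(y) of every edge."""
    return {(x, y): w - potential[x] + potential[y] for (x, y), w in weights.items()}


# Edges a->b (-2), b->t (-3), a->t (-7); the longest paths to t weigh
# 0 from t, -3 from b, and max(-2 - 3, -7) = -5 from a.
example_weights = {("a", "b"): -2, ("b", "t"): -3, ("a", "t"): -7}
example_potential = {"a": -5, "b": -3, "t": 0}
example_reduced = reduced_weights(example_weights, example_potential)
```

Edges on a longest path receive transformed weight zero and every other edge becomes negative, so a Dijkstra-style search for longest paths is applicable. The transformed weight of a whole path from x to y differs from its original weight only by the constant $-d(x) + d(y)$, so the ranking of paths between the same endpoints is unchanged.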

Dijkstra's algorithm is also modified so that it assigns $-\infty$ to each $v \in D$ such that each path from 
$v$ to $B_i$ has a suffix of weight less than $-b$. Therefore, the vertices from which $B_i$ is not reachable or is 
reachable only via paths with suffixes of weight less than $-b$ have $d_i$ equal to $-\infty$. Also, vertices from 
$V \setminus D$ have $d_i$ equal to $-\infty$. A detailed description of Dijkstra() is in the full version of this paper [3]. 

On line 5, the variable $i$ is increased (thus the current longest path weights are now in $d_{i-1}$), and 
on line 6, we remove from $B_{i-1}$ each vertex $v$ that does not have an outgoing edge $(v,u)$ such that 
$w(v,u) - d_{i-1}(v) + d_{i-1}(u) \geq 0$. Another iteration is started only if $B_i \neq B_{i-1}$. If no vertex is removed 
on line 6, then $B_i = B_{i-1}$ and the algorithm finishes and returns $d_{i-1}$ as output. The following theorem 
establishes the correctness of the algorithm. An intuition why the theorem holds was given above. Its 
formal proof is in the full version of this paper [3]. 

Theorem 1 Let $(\Gamma, b)$ be an instance of the lower-weak-upper-bound problem. Let further $\pi \in \Pi^\Gamma_M$ 
be a positional strategy of Min, and finally let $d_{-1} \in (\mathbb{Z} \cup \{-\infty\})^V$ be such that for $A = \{v \in V \mid 
d_{-1}(v) = 0\}$ and $D = \{v \in V \mid d_{-1}(v) > -\infty\}$, the conditions (i.) and (ii.) hold. Then for $d :=$ 
EvaluateStrategy$(\Gamma, b, \pi, d_{-1})$ it holds that for each $v \in D$, $d(v) = -\mathrm{lwub}_b^{\Gamma_\pi(D)}(v)$. 

The complexity of EvaluateStrategy() is $O(|V| \cdot (|V| \cdot \log |V| + |E|))$. Each iteration takes 
$O(|V| \cdot \log |V| + |E|)$ because of Dijkstra() and the number of iterations of the while loop on 
lines 3-7 is at most $|V|$, because $B_i \subseteq V$ loses at least one element in each iteration. 

Figure 2 shows pseudo-code of our strategy improvement algorithm for solving the 
lower-weak-upper-bound problem using EvaluateStrategy(). The input to the algorithm is a 
lower-weak-upper-bound problem instance $(\Gamma, b)$. The output of the algorithm is the vector $\mathrm{lwub}^\Gamma_b$. The pseudo-code 
corresponds to the high-level description of the algorithm given at the beginning of this section. 

The algorithm proceeds in iterations. It starts by taking an arbitrary strategy from $\Pi^\Gamma_M$ on line 2 and 
initializing the lower estimate of $\mathrm{lwub}^\Gamma_b$ to a vector of zeros on line 3. Then it alternates strategy evaluation 
(line 6) and strategy improvement (lines 10-18) until no improvement is possible, at which point the main 
while-loop on lines 5-19 terminates and the final $d$ vector is returned on line 20. At that point, it holds 
that for each $v \in V$, $-d_{i-1}(v) = \mathrm{lwub}^\Gamma_b(v)$. The whole algorithm KASI is illustrated in Example 1. The 
following lemmas and theorem establish the correctness of the algorithm. 
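
The alternation of evaluation and improvement can be sketched in executable form under a strong simplification: the strategy evaluation below is a naive fixed-point iteration used as a stand-in for EvaluateStrategy(), not the Dijkstra-based procedure of this paper, and it assumes that this iteration yields the same values in the restricted game where Min has no choices. All names and the successor-list representation are illustrative assumptions.

```python
import math
from typing import Dict, List, Set, Tuple

Succ = Dict[str, List[Tuple[str, int]]]  # vertex -> list of (successor, weight)


def evaluate_naively(succ: Succ, min_vertices: Set[str], strategy: Dict[str, str],
                     alive: Set[str], b: int) -> Dict[str, float]:
    """Stand-in evaluation: needed energy when Min follows `strategy` and
    vertices outside `alive` (the set D) are already known to be losing."""
    need = {v: (0 if v in alive else math.inf) for v in succ}
    changed = True
    while changed:
        changed = False
        for v in alive:
            edges = [(u, w) for u, w in succ[v]
                     if v not in min_vertices or strategy[v] == u]
            x = min(need[u] - w for u, w in edges)  # Max picks the cheapest edge
            x = math.inf if x > b else max(x, 0)
            if x > need[v]:
                need[v] = x
                changed = True
    return need


def kasi_sketch(succ: Succ, min_vertices: Set[str], b: int) -> Dict[str, float]:
    strategy = {v: succ[v][0][0] for v in min_vertices}  # arbitrary first strategy
    alive = set(succ)                                    # the set D
    while True:
        need = evaluate_naively(succ, min_vertices, strategy, alive, b)
        d = {v: -x for v, x in need.items()}             # d = -need, -inf means lost
        alive = {v for v in alive if d[v] > -math.inf}
        improved = False
        for v in min_vertices:
            if d[v] == -math.inf:
                continue
            for u, w in succ[v]:
                if d[v] > d[u] + w:                      # improvement condition
                    strategy[v] = u
                    improved = True
        if not improved:
            return {v: -d[v] for v in succ}
```

The sketch keeps the shrinking set D explicitly, because the same positional strategy of Min may be selected again later with a smaller D, and evaluating it on the whole graph would then repeat an earlier result.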

Example 1 Figure 3 shows an example of a run of our algorithm KASI on a simple MPG. The MPG is in 
Figure 3 (a). Circles are Max's vertices and the square is Min's vertex. Let us denote the MPG by $\Gamma$, let 
1 proc LowerWeakUpperBound(Γ, b) 
2   i := 0; π_0 := arbitrary strategy from Π^Γ_M 
3   d_{-1} := 0 
4   improvement := true 
5   while improvement do 
6     d_i := EvaluateStrategy(Γ, b, π_i, d_{i−1}) 
7     improvement := false 
8     i := i + 1 
9     π_i := π_{i−1} 
10    foreach v ∈ V_Min do 
11      if d_{i−1}(v) > −∞ then 
12        foreach (v,u) ∈ E do 
13          if d_{i−1}(v) > d_{i−1}(u) + w(v,u) then 
14            π_i(v) := u; improvement := true 
15          fi 
16        od 
17      fi 
18    od 
19  od 
20  return −d_{i−1} 
21 end 

Figure 2: Solving the lower-weak-upper-bound problem 

$b = 15$ and consider a lower-weak-upper-bound problem given by $(\Gamma, b)$. Min has only two positional 
strategies, namely, $\pi^1$ and $\pi^2$, where $\pi^1(v_3) = v_1$ and $\pi^2(v_3) = v_4$. Let $\pi = \pi^2$ be the first selected 
strategy. For simplicity, we will use the symbols $\pi$, $d$, $B$, and $D$ without indices, although in the 
pseudo-codes these symbols have indices, and the set $D$ of vertices with finite $d$ value is not even explicitly 
used. Also, if we speak about a weight of an edge, we mean the weight with the potential transformation 
by $d$. Figure 3 illustrates the progress of the algorithm. Each figure denoted by (r.s) shows a state of 
computation right after update of the vector $d$ by Dijkstra(). r is the value of the iteration counter $i$ of 
LowerWeakUpperBound(), and s is the value of the iteration counter $i$ of EvaluateStrategy(). 
In each figure, the $d$ value of each vertex is shown by that vertex. Edges that do not belong to the current 
strategy $\pi$ of Min are dotted. A detailed description of the progress of the algorithm follows. Initially, 
$\pi = \pi^2$, $d = 0$, and $D = \{v_1, v_2, v_3, v_4\}$. 

There are three vertices in $G_\pi(D)$ with non-negative edges emanating from them, namely, $v_1, v_2, v_3$, 
and so EvaluateStrategy() takes $\{v_1, v_2, v_3\}$ as the first set $B$. After the vector $d$ is updated so that it 
contains longest path weights to $B$ (Figure 3 (0.0)), all vertices in $B$ still have non-negative edges, and so 
the strategy evaluation finishes and the strategy improvement phase is started. The strategy improvement 
condition is satisfied for the edge $(v_3, v_1)$ and so $\pi$ is improved so that $\pi = \pi^1$. This completes the first 
iteration of KASI and another one is started to evaluate and possibly improve the new strategy $\pi$. 

Now the vertex $v_3$ does not have a non-negative edge emanating from it, so it is removed from the set 
$B$ and the longest path weights are recomputed (Figure 3 (1.0)). Please note that the only path from $v_4$ to 
$B$ has a suffix of weight less than $-b$, and so $d(v_4) = -\infty$ and $v_4$ is removed from the set $D$. The update to 
$d$ causes that $v_2$ does not have a non-negative edge, thus it is also removed from the set $B$ and the vector 
$d$ is recomputed again (Figure 3 (1.1)). This finishes the strategy evaluation and strategy improvement 
Figure 3: Example of a Run of KASI 

follows. The strategy improvement condition is satisfied for the edge $(v_3, v_4)$, and so the strategy $\pi^2$ is 
selected as the current strategy $\pi$ again. However, this is not the same situation as at the beginning, 
because the set $D$ is now smaller. Evaluation of the strategy $\pi$ results in the $d$ vector as depicted in 
Figure 3 (2.0). The vertex $v_3$ has $d(v_3) = -\infty$ because $v_3$ cannot reach the set $B$, which also results in 
removal of $v_3$ from $D$. No further improvement of $\pi$ is possible, and so $\mathrm{lwub}^\Gamma_b = -d = (0, 12, \infty, \infty)$. 

Lemma 1 Every time line 6 of LowerWeakUpperBound() is reached, $\Gamma$, $b$, $\pi_i$, and $d_{i-1}$ satisfy the 
assumptions of Theorem 1. Every time line 7 of LowerWeakUpperBound() is reached and $i > 0$, it 
holds that $d_i < d_{i-1}$. 

A formal proof of Lemma 1 is in the full version of this paper [3]. The proof uses the following facts. 
The first one we have already used: If $p$ is a path from $v$ to $u$ such that for each edge $(x,y)$ in the path it 
holds that $d(x) \geq d(y) + w(x,y)$, then $d(v) \geq d(u) + w(p)$, and if for some edge the inequality is strict, 
then $d(v) > d(u) + w(p)$. The second fact is similar: If $c$ is a cycle such that for each edge $(x,y)$ in the 
cycle it holds that $d(x) \geq d(y) + w(x,y)$, then $0 \geq w(c)$, and if for some edge the inequality is strict, then 
the cycle is strictly negative. Using these facts we can now give an intuition why the lemma holds. 

The assumptions of Theorem 1, conditions (i.) and (ii.), are trivially satisfied in the first iteration of 
LowerWeakUpperBound(), as already mentioned. During execution of EvaluateStrategy(), 
conditions (i.) and (ii.) remain satisfied, for the following reasons. The $d$ values of vertices from $D$ 
are weights of longest paths to $B$, and so each edge $(x,y)$ emanating from a vertex from $D \setminus B$ satisfies 
$d(x) \geq d(y) + w(x,y)$. Only vertices with all outgoing edges negative with the potential transformation 
are removed from the set $B$, i.e., only the vertices with each outgoing edge $(x,y)$ satisfying $d(x) > 
d(y) + w(x,y)$. Using the facts from the previous paragraph, we can conclude that all newly formed 
cycles in $G_\pi(D \setminus B)$ are negative and the weights of longest paths to $B$ cannot increase. So to complete 
the intuition, it remains to show why the conditions still hold after the strategy improvement and why the 
strategy improvement results in a decrease of the $d$ vector. This follows from the fact that the new edges 
introduced by the strategy improvement are negative with the potential transformation. 
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Lemma 2 The procedure LowerWeakUpperBound() always terminates. 

Proof: By Lemma 1, $d_i$ decreases in each iteration. For each $v \in V$, $d_i(v)$ is bounded from below by the term $-(|V| - 1) \cdot W$, because it is the weight of some path in G with no repeated vertices (except for the case when $d_i(v) = -\infty$, but this is obviously not a problem). Since $d_i$ is a vector of integers, an infinite chain of improvements is not possible, and so termination is guaranteed. ■

Theorem 2 is the main theorem of this paper, which establishes the correctness of our algorithm. Its proof is in the full version of this paper [3]. The key idea of the proof is to define strategies for both players with the following properties. Let $d_s := \mathrm{LowerWeakUpperBound}(\Gamma, b)$. Max's strategy that we will define ensures that for each vertex $v \in V$, $d_s(v)$ is a sufficient amount of initial energy no matter what his opponent does, and Min's strategy that we will define ensures that Max cannot do with a smaller amount of initial energy. In particular, for vertices with $d_s(v) = \infty$, the strategy ensures that Max will eventually go negative or traverse a path segment of weight less than $-b$, even with an arbitrarily large amount of initial energy. From the existence of such strategies it follows that for each $v \in V$, $d_s(v) = \mathrm{lwub}^{\Gamma}_{b}(v)$, and both strategies are optimal with respect to the lower-weak-upper-bound problem.

The optimal strategy of Max is constructed from the final longest path forest computed by the pro- 
cedure Dijkstra() and the non-negative (with potential transformation) edges emanating from the final 
set B. The optimal strategy of Min is more complicated. 

There is a theorem in [2] which claims that Min can restrict herself to positional strategies. Unfortunately, this is not true. Unlike Max, Min sometimes needs memory. The example in Figure 3 is a proof of this fact, because neither of the two positional strategies of Min guarantees that Max loses from $v_3$. However, Min can play optimally using the sequence of positional strategies computed by our algorithm. In that example, to guarantee that Max loses from $v_3$, Min first sends the play from $v_3$ to $v_4$, and when it returns back to $v_3$, she sends the play to $v_1$. As a result, a path of weight $-20$ is traversed, and since $b = 15$, Max loses.

In general, let $\pi_0, \pi_1, \ldots$ be the sequence of positional strategies computed by the algorithm. Min uses the sequence in the following way: if the play starts from a vertex with finite final d value and never leaves the set of vertices with finite final d value, then Min uses the last strategy in the sequence, and it is the best she can do, as stated by Theorem 1. If the play starts at or gets to a vertex with infinite final d value, she uses the strategy that caused the d value of that vertex to become $-\infty$, but only until the play gets to a vertex that was made infinite by some strategy with a lower index. At that moment Min switches to the appropriate strategy. In particular, Min never switches to a strategy with a higher index.
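
The switching rule can be illustrated by a small sketch. It is not the implementation used in the paper: the class name `MinSwitchingStrategy`, the map `made_infinite_by` (vertex to the index of the strategy under which its d value became $-\infty$, or `None` for vertices with finite final d value), and the toy data are all assumptions made for illustration. The toy data is only loosely modeled on the run in Figure 3, where Min first plays from $v_3$ to $v_4$ and later from $v_3$ to $v_1$.

```python
from typing import Dict, List, Optional

Vertex = str
Strategy = Dict[Vertex, Vertex]  # maps each Min vertex to the successor Min picks


class MinSwitchingStrategy:
    """Min's memory-based strategy built from a sequence of positional ones."""

    def __init__(self, strategies: List[Strategy],
                 made_infinite_by: Dict[Vertex, Optional[int]]) -> None:
        self.strategies = strategies
        self.made_infinite_by = made_infinite_by
        # While the play stays among finite vertices, Min uses the last strategy.
        self.current = len(strategies) - 1

    def visit(self, v: Vertex) -> None:
        """Update the memory when the play reaches vertex v."""
        idx = self.made_infinite_by.get(v)
        # Switch only to a strategy with a lower index, never to a higher one.
        if idx is not None and idx < self.current:
            self.current = idx

    def choose(self, v: Vertex) -> Vertex:
        """Successor chosen at Min's vertex v under the strategy in use."""
        return self.strategies[self.current][v]


# Hypothetical data: three positional strategies, v3 is Min's only vertex.
strategies = [{"v3": "v4"}, {"v3": "v1"}, {"v3": "v4"}]
made_infinite_by = {"v1": None, "v2": None, "v3": 2, "v4": 1}

min_player = MinSwitchingStrategy(strategies, made_infinite_by)
min_player.visit("v3")
first_move = min_player.choose("v3")    # strategy 2 is in use
min_player.visit("v4")                  # v4 was made infinite by strategy 1
min_player.visit("v3")                  # the play returns to v3
second_move = min_player.choose("v3")   # strategy 1 is still in use
```

The single integer `current` is all the memory Min needs in this sketch, and it can only decrease during a play, which mirrors the remark that Min never switches to a strategy with a higher index.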

Theorem 2 Let $d_s := \mathrm{LowerWeakUpperBound}(\Gamma, b)$. Then for each $v \in V$, $d_s(v) = \mathrm{lwub}^{\Gamma}_{b}(v)$.

The algorithm has a pseudopolynomial time complexity: $O(|V|^2 \cdot (|V| \cdot \log |V| + |E|) \cdot W)$. It takes $O(|V|^2 \cdot W)$ iterations until the while-loop of LowerWeakUpperBound() terminates. The reason is that for each $v \in V$, if $d(v) > -\infty$, then $d(v) \ge -(|V| - 1) \cdot W$, because $d(v)$ is the weight of some path with no repeated vertices, and so the d vector can be improved at most $O(|V|^2 \cdot W)$ times. Each iteration, if considered separately, takes $O(|V| \cdot (|V| \cdot \log |V| + |E|))$, so one would say that the overall complexity should be $O(|V|^3 \cdot (|V| \cdot \log |V| + |E|) \cdot W)$. However, the number of elements of the set $B_i$ in EvaluateStrategy() never increases, even between two distinct calls of the evaluation procedure, hence the amortized complexity of one iteration is only $O(|V| \cdot \log |V| + |E|)$.
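
The iteration bound can be spelled out as a counting argument. The sketch below assumes that $d(v) \le 0$ holds throughout (as suggested by $\mathrm{lwub}^{\Gamma}_{b} = -d$) and that every iteration strictly decreases at least one component of the integer vector d.

```latex
\begin{align*}
\#\{\text{decreases of } d(v)\} &\le (|V| - 1) \cdot W + 1
  && \text{(from } 0 \text{ down to } -(|V|-1) \cdot W \text{, then to } -\infty\text{)} \\
\#\{\text{iterations}\} &\le |V| \cdot \bigl( (|V| - 1) \cdot W + 1 \bigr) + 1
  = O(|V|^2 \cdot W) \\
\text{total time} &= O(|V|^2 \cdot W) \cdot O(|V| \cdot \log |V| + |E|)
  = O\bigl( |V|^2 \cdot (|V| \cdot \log |V| + |E|) \cdot W \bigr)
\end{align*}
```

The last line uses the amortized cost of one iteration rather than its worst-case cost, which is where the factor $|V|$ is saved.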

The algorithm can even be improved so that its complexity is $O(|V| \cdot (|V| \cdot \log |V| + |E|) \cdot W)$. This is accomplished by efficiently computing the vertices which will update their d value in the next iteration, so that computational time is not wasted on vertices whose d value is not going to change. Interestingly enough, the same technique can be used to improve the complexity of the algorithm of Björklund and Vorobyov so that the complexities of the two algorithms are the same. A detailed description of the technique is in the full version of this paper [3].

4 Experimental Evaluation

Our experimental study compares four algorithms for solving the lower-bound and the lower-weak-upper-bound problems. The first is value iteration [7, 11] (VI). The second and the third are combinations of VI with another algorithm. Finally, the fourth algorithm is our algorithm KASI. We will now briefly describe the algorithms based on VI.

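Let $(\Gamma, b)$ be an instance of the lower-weak-upper-bound problem. VI starts with $d_0(v) = 0$, for each $v \in V$, and then computes $d_1, d_2, \ldots$ according to the following rules.

\[
d_k(v) =
\begin{cases}
x = \min_{(v,u) \in E} \max\bigl(0,\, d_{k-1}(u) - w(v,u)\bigr) & \text{if } v \in V_{Max} \wedge x \le b \\
x = \max_{(v,u) \in E} \max\bigl(0,\, d_{k-1}(u) - w(v,u)\bigr) & \text{if } v \in V_{Min} \wedge x \le b \\
\infty & \text{otherwise}
\end{cases}
\]

It is easy to see that for each $v \in V$ and $k \in \mathbb{N}_0$, $d_k(v)$ is the minimum amount of Max's initial energy that enables him to keep the sum of the weights of the traversed edges, plus $d_k(v)$, greater than or equal to zero in a k-step play. The computation continues until two consecutive d vectors are equal. The last d vector is then the desired vector $\mathrm{lwub}^{\Gamma}_{b}$. If $b = (|V| - 1) \cdot W$, the algorithm solves the lower-bound problem. The complexity of the straightforward implementation of the algorithm is $O(|V|^2 \cdot |E| \cdot W)$, which was improved in [7, 11] to $O(|V| \cdot |E| \cdot W)$, which is slightly better than the complexity of KASI.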
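
The straightforward variant of VI can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation used in the experiments: the function name `value_iteration`, the adjacency-list encoding, and the toy game are hypothetical, the update is the usual one for this kind of problem (Max minimizes and Min maximizes the required energy, and values above the finite bound b become infinite), and every vertex is assumed to have at least one outgoing edge.

```python
import math
from typing import Dict, List, Set, Tuple

INF = math.inf


def value_iteration(vertices: List[str], max_vertices: Set[str],
                    edges: Dict[str, List[Tuple[str, int]]],
                    b: int) -> Dict[str, float]:
    """Iterate d_0 = 0, d_1, d_2, ... until two consecutive vectors are equal.

    edges[v] is a list of (successor, weight) pairs; b is a finite bound.
    """
    d: Dict[str, float] = {v: 0 for v in vertices}
    while True:
        new_d: Dict[str, float] = {}
        for v in vertices:
            # Energy needed at v to take edge (v, u) and still have d[u] at u.
            needs = [max(0, d[u] - w) for (u, w) in edges[v]]
            x = min(needs) if v in max_vertices else max(needs)
            new_d[v] = x if x <= b else INF
        if new_d == d:
            return d
        d = new_d


# Hypothetical toy game: "a" and "c" belong to Max, "m" belongs to Min.
vertices = ["a", "m", "c"]
max_vertices = {"a", "c"}
edges = {
    "a": [("m", -3)],
    "m": [("a", 3), ("c", -10)],
    "c": [("c", 0)],
}

result_high_bound = value_iteration(vertices, max_vertices, edges, b=20)
result_low_bound = value_iteration(vertices, max_vertices, edges, b=12)
```

With the bound 20, Max needs 13 units at "a" (3 to reach "m", then 10 more in case Min moves to "c"). With the bound 12, this amount cannot be stored, so "a" and "m" become infinite, which shows how a small bound changes the result.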

The shortcoming of VI is that it takes an enormous amount of time before the vertices with infinite $\mathrm{lb}^{\Gamma}$ and $\mathrm{lwub}^{\Gamma}_{b}$ value are identified. That is why we first compute the vertices with $\nu < 0$ by some fast MPG solving algorithm and then apply VI to the rest of the graph. For the lower-bound problem, the vertices with $\nu < 0$ are exactly the vertices with infinite $\mathrm{lb}^{\Gamma}$ value. For the lower-weak-upper-bound problem, the vertices with $\nu < 0$ might be a strict subset of the vertices with infinite $\mathrm{lwub}^{\Gamma}_{b}$ value, but still the preprocessing sometimes saves a lot of time in practice. Obviously, on MPGs in which all vertices have $\nu \ge 0$, the preprocessing does not help at all. It is also not helpful for the lower-weak-upper-bound problem with a small bound b.

According to our experiments, partly published in [6], the fastest algorithms in practice for dividing the vertices of an MPG into those with $\nu \ge 0$ and those with $\nu < 0$ are the algorithm of Björklund and Vorobyov [1] (BV) and the algorithm of Schewe [18] (SW). The fact that they are the fastest does not directly follow from [6], because that paper focuses on parallel algorithms and computation of the exact $\nu$ values.

The original algorithm BV is a sub-exponential randomized algorithm. To prove that the algorithm is sub-exponential, some restrictions had to be imposed. If these restrictions are not obeyed, BV runs faster. Therefore, we decided not to obey the restrictions and to use only the "deterministic part" of the algorithm. We used only the modified BV algorithm in our experimental study. We even improved the complexity of the deterministic algorithm from $O(|V|^2 \cdot |E| \cdot W)$ to $O(|V| \cdot (|V| \cdot \log |V| + |E|) \cdot W)$ using the same technique as for the improvement of the complexity of KASI, which is described in the full version of this paper [3]. Since the results of the improved BV were significantly better on all input instances included in our experimental study, all results of BV in this paper are the results of the improved BV.

The complexity of SW is $O(|V|^2 \cdot (|V| \cdot \log |V| + |E|) \cdot W)$. It might seem that this is in contradiction with the title of Schewe's paper [18], because if some algorithm is optimal, one would expect that there are no algorithms with better complexity. However, the term "optimal" in the title of the paper refers to the strategy improvement technique. SW is also a strategy improvement algorithm, and the strategy improvement steps in SW are optimal in a certain sense.

We note that any algorithm that divides the vertices of an MPG into those with $\nu \ge 0$ and those with $\nu < 0$ can be used to solve the lower-bound and the lower-weak-upper-bound problem with the help of binary search, but it requires the introduction of auxiliary edges and vertices into the input MPG and repeated application of the algorithm. According to our experiments, BV and SW run no faster than KASI. Therefore, solving the two problems by repeated application of BV and SW would lead to higher runtimes than the runtimes of KASI. If we use the reduction technique from [2], then BV/SW has to be executed $\Theta(|V| \cdot \log(|V| \cdot W))$ times to solve the lower-bound problem, and $\Theta(|V| \cdot \log b)$ times to solve the lower-weak-upper-bound problem. That is why we compared KASI only with the algorithm VI and the two combined algorithms: VI + BV and VI + SW. The complexities of BV and SW exceed the complexity of VI, and so the complexities of the combined algorithms are the complexities of BV and SW.

4.1 Input MPGs 

We experimented with completely random MPGs as well as more structured synthetic MPGs and MPGs 
modeling simple reactive systems. The synthetic MPGs were generated by two generators, namely 
SPRAND 13 and TOR HI, downloadable from |[T4i . The outputs of these generators are only directed 
weighted graphs, and so we had to divide vertices between Max and Min ourselves. We divided them 
uniformly at random. The MPGs modeling simple reactive systems we created ourselves. 

SPRAND was used to generate the "randx" MPG family. Each of these MPGs contains $|E| = x \cdot |V|$ edges and consists of a random Hamiltonian cycle and $|E| - |V|$ additional random edges, with weights chosen uniformly at random from [1, 10000]. To make these inputs harder for the algorithms, in each of them, we subtracted a constant from each edge weight so that the $\nu$ value of each vertex is close to 0.
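
The structure of the randx inputs can be mimicked by a short sketch. It does not reproduce the SPRAND generator itself: the function names `generate_randx` and `shift_weights` are hypothetical, self-loops and parallel edges among the additional edges are allowed here for simplicity, and the constant to subtract is left to the caller because choosing it requires knowing the $\nu$ values.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (source, target, weight)


def generate_randx(n: int, x: int, seed: int = 0) -> Tuple[List[Edge], List[str]]:
    """Random Hamiltonian cycle plus (x - 1) * n random edges, random owners."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    # The Hamiltonian cycle guarantees that every vertex has an outgoing edge.
    edges: List[Edge] = [
        (order[i], order[(i + 1) % n], rng.randint(1, 10000)) for i in range(n)
    ]
    for _ in range(x * n - n):
        edges.append((rng.randrange(n), rng.randrange(n), rng.randint(1, 10000)))
    # Vertices are divided between the players uniformly at random.
    owners = [rng.choice(("Max", "Min")) for _ in range(n)]
    return edges, owners


def shift_weights(edges: List[Edge], constant: int) -> List[Edge]:
    """Subtract the same constant from every edge weight."""
    return [(u, v, w - constant) for (u, v, w) in edges]


edges, owners = generate_randx(n=8, x=5, seed=1)
shifted = shift_weights(edges, constant=5000)
```

Subtracting a constant c from every weight lowers the average weight of every cycle by exactly c, so it shifts all $\nu$ values by c without changing which strategies are optimal for the mean-payoff objective.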

TOR was used for generation of the families "sqnc", "lnc", and "pnc". The sqnc and lnc families are 2-dimensional grids with wrap-around, while the pnc family contains layered networks embedded on a torus. We also created subfamilies of each of the three families by adding cycles to the graphs. For more information on these inputs, we refer the reader to [13] or [6]. As with the SPRAND generated inputs, we adjusted each TOR generated MPG so that the $\nu$ value of each vertex is close to 0.

As for the MPGs modeling simple reactive systems, we created three parameterized models. The first 
is called "collect" and models a robot on a ground with obstacles which has to collect items occurring 
at different locations according to certain rules. Moving and even idling consumes energy, and so the 
robot has to return to its docking station from time to time to recharge. By solving the lower-bound, or 
the lower-weak-upper-bound problem for the corresponding MPG, depending on whether there is some 
upper bound on the robot's energy, we find out from which initial configurations the robot has a strategy 
which ensures that it will never consume all of its energy outside the docking station, and we also get 
some strategy which ensures it. For each "good" initial configuration, we also find out the minimal 
sufficient amount of initial energy. We note that the energy is not a part of the states of the model. If it 
was, the problem would be much simpler. We could simply compute the set of states from which Min 
has a strategy to get the play to a state where the robot has zero energy and it is not in the docking station. 
However, making the energy part of the states would cause an enormous increase in the number of states 
and make the model unmanageable. 

The second model is called "supply" and models a truck which delivers material to various locations 
the selection of which is beyond its control. The goal is to never run out of the material so that the truck is 
always able to satisfy each delivery request. We also want to know the minimal sufficient initial amount 
of the material. 

The third model is called "taxi" and models a taxi which transports people at their request. Its 
operation costs money and the taxi also earns money. The goal is to never run out of money, and we also 
want to know the minimal sufficient initial amount of money. 

To get MPGs of manageable size, the models are, of course, very simplified, but they are still much closer to real-world problems than the synthetic MPGs.

                                        lower-bound                       lower-weak-upper-bound
MPG      (|V| |E|)               VI   VI + BV   VI + SW      KASI        VI   VI + BV   VI + SW      KASI
sqnc01                          n/a     31.22     55.40     17.83     13.28     19.41     43.54      9.06
sqnc02                          n/a     13.30     20.14     11.01      2.88     10.70     17.52      3.57
sqnc03   (262k 525k)            n/a      3.18      3.54      1.58      0.75      3.20      3.55      1.07
sqnc04                          n/a      9.34     11.49      8.48      1.65      8.55     10.70      2.53
sqnc05   (262k 786k)            n/a     10.45     14.24      4.89      1.20     10.17     13.95      1.72
lnc01    (262k 524k)          60.79     67.89    111.32     11.31     17.49     27.85     71.19      5.99
lnc02    (262k 524k)          57.63     63.99     93.89     10.34     14.68     24.04     53.87      5.06
lnc03    (262k 525k)            n/a      3.30      4.39      1.48      0.73      3.34      4.41      1.03
lnc04    (262k 528k)            n/a     21.05     25.28     10.63      3.53     11.65     15.64      4.24
lnc05    (262k 786k)            n/a     10.89     11.30      4.77      1.17     10.64     11.03      1.65
pnc01    (262k 2097k)           n/a     24.27     16.08      3.80      1.41     24.31     16.08      1.98
pnc02    (262k 2097k)           n/a     25.49     15.37      3.80      1.43     25.55     15.38      1.98
pnc03    (262k 2098k)           n/a     23.48     17.66      3.86      1.48     23.53     17.66      2.04
pnc04    (262k 2101k)           n/a     26.36     25.24      3.91      1.49     26.34     25.23      2.05
pnc05    (262k 2359k)           n/a     27.09     29.69      4.71      1.97     27.15     29.69      2.51
rand5    (262k 1310k)           n/a     19.16     20.42      4.55      1.65     19.27     20.54      2.39
rand5b   (524k 2621k)           n/a     36.29     33.06     10.09      3.53     36.52     33.24      5.17
rand5h   (1048k 5242k)          n/a     86.55     59.01     21.35      7.45     87.16     59.48     11.07
rand10   (262k 2621k)           n/a     39.30     36.96      5.60      2.37     39.39     37.00      3.68
rand10b                         n/a    105.69     43.43     14.54      5.07    105.97     43.45      7.98
rand10h  (1048k 10485k)         n/a    140.46    110.68     29.27     11.38    140.57    110.82     17.52
collect1 (636k 3309k)        996.08   1027.12   1032.55      5.68    531.40    544.77    563.78      4.89
collect2 (636k 3309k)        338.56    352.45    367.12      5.70    181.35    189.17    208.52      4.89
supply1  (363k 1014k)       6956.23     16.03    109.72      1.79      7.72      8.87    102.97      1.85
supply2  (727k 2030k)      28046.54     65.08    449.47      3.64     30.84     33.31    418.88      3.77
taxi1    (509k 979k)          11.64     12.85     13.16      1.29      0.70      1.49      2.17      1.38
taxi2    (509k 979k)           6.00      7.03      7.51      1.29      0.70      1.49      2.17      1.38

Table 1: Runtimes of the experiments (in seconds)



4.2 Results 

The experiments were carried out on a machine equipped with two dual-core Intel® Xeon® 2.00GHz processors and 16GB of RAM, running GNU/Linux kernel version 2.6.26. All algorithms were implemented in C++ and compiled with GCC 4.3.2 with the "-O2" option.

Table 1 gives the results of our experiments. The first column of the table contains names of the input MPGs. Numbers of vertices and edges, in thousands, are in brackets. The MPGs prefixed by "sqnc", "lnc", and "pnc" were generated by the TOR generator. They all contain $2^{18}$ vertices. The MPGs prefixed by "rand" were generated by the SPRAND generator. For both the rand5 and rand10 families, we experimented with three sizes of graphs, namely, with $2^{18}$ vertices (no suffix), with $2^{19}$ vertices (suffix "b"), and with $2^{20}$ vertices (suffix "h"). Finally, the MPGs prefixed by "collect", "supply", and "taxi" are the models of simple reactive systems created by ourselves. For each model, we tried two different values of parameters.

Each MPG used in the experiments has eight columns in the table. Each column headed by the name of an algorithm contains execution times of that algorithm in seconds, excluding the time for reading input. The term "n/a" means more than 10 hours. The first four columns contain results for the lower-bound problem, and the last four columns contain the results for the lower-weak-upper-bound problem, which contains a bound b as a part of the input. If the bound is too high, the algorithms essentially solve the lower-bound problem, and so the runtimes are practically the same as for the lower-bound problem. If the bound is too low, all vertices in our inputs have infinite $\mathrm{lwub}^{\Gamma}_{b}$ value, and they become very easy to solve. We tried various values of b, and for this paper, we selected as b the average $\mathrm{lb}^{\Gamma}$ value of the vertices with finite $\mathrm{lb}^{\Gamma}$ value divided by 2, which seems to be a reasonable amount so that the results provide insight. We note that smaller b makes the computation of VI and KASI faster. However, the BV and SW parts of VI + BV and VI + SW always perform the same work, and so for $b \ll (|V| - 1) \cdot W$, the combined algorithms are often slower than VI alone.
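
The choice of b described above amounts to a one-line computation. The sketch below is only an illustration: the function name `select_bound` is hypothetical, and rounding the result down to an integer is an assumption, since the text does not say how a fractional value was handled.

```python
import math
from typing import Iterable


def select_bound(lb_values: Iterable[float]) -> int:
    """Half of the average lb value over the vertices with finite lb value."""
    finite = [x for x in lb_values if math.isfinite(x)]
    if not finite:
        raise ValueError("no vertex has a finite lb value")
    average = sum(finite) / len(finite)
    # Assumption: round down so that the bound stays an integer.
    return math.floor(average / 2)


# Hypothetical lb vector with two finite and two infinite entries.
example_bound = select_bound([0, 12, math.inf, math.inf])
```

For the hypothetical vector above, the finite values average to 6, so the selected bound is 3.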

The table shows that the algorithm KASI was the fastest on all inputs for the lower-bound problem. 
For the lower-weak-upper-bound problem it was never slower than the fastest algorithm by more than a 
factor of 2, and for some inputs it was significantly faster. This was true for all values of b that we tried. 
Therefore, the results clearly suggest that KASI is the best algorithm. In addition, there are several other 
interesting points. 

VI is practically unusable for solving the lower-bound problem for MPGs with some vertices with $\nu < 0$. Except for lnc01-02, collect1-2, and taxi1-2, all input MPGs had vertices with $\nu < 0$. The preprocessing by BV and SW reduces the execution time by orders of magnitude for these MPGs. On the other hand, for the lower-weak-upper-bound problem for the bound we selected, VI is often very fast and the preprocessing slows the computation down in most cases. VI was even faster than KASI on a lot of inputs. However, the difference was never significant, and it was mostly caused by the initialization phase of the algorithms, which takes more time for the more complex algorithm KASI. Moreover, for some inputs, especially from the "collect" family, VI is very slow. VI makes a lot of iterations for the inputs from the collect family, because the robot can survive for quite long by idling, which consumes a very small amount of energy per time unit. However, it cannot survive by idling forever. The i-th iteration of VI computes the minimal sufficient initial energy to keep the energy level non-negative for i time units, and so until the idling consumes at least as much energy as the minimal sufficient initial energy to keep the energy level non-negative forever, new iterations have to be started. We believe that this is a typical situation for this kind of application. Other inputs for which VI took a lot of time are: sqnc01, lnc01-02, supply1-2.

Finally, we comment on the scalability of the algorithms. As the experiments on the SPRAND generated inputs suggest, the runtimes of the algorithms increase no faster than the term $|V| \cdot |E|$, and so they are able to scale up to very large MPGs.

5 Conclusion 

We proposed a novel algorithm for solving the lower-bound and the lower-weak-upper-bound problems 
for MPGs. Our algorithm, called Keep Alive Strategy Improvement (KASI), is based on the strategy 
improvement technique which is very efficient in practice. To demonstrate that the algorithm is able to 
solve the two problems for large MPGs, we carried out an experimental study. In the study we compared 
KASI with the value iteration algorithm (VI) from [7, 11], which we also improved by combining it with the algorithm of Björklund and Vorobyov [1] (BV) and the algorithm of Schewe (SW). KASI is the clear winner of the experimental study.

Two additional results of this paper are the improvement of the complexity of BV and the characterization of Min's optimal strategies w.r.t. the lower-weak-upper-bound problem.